AMENDMENT AND RESPONSE UNDER 37 CFR § 1.111 Page 2 

Serial Number: 10/643,742 Dkt: 1376.697US1 

Filing Date: August 18, 2003 
IN THE SPECIFICATION 

Please amend the title as follows: 

[[DECOUPLED STORE ADDRESS AND DATA IN A MULTIPROCESSOR SYSTEM]] 
DECOUPLING OF WRITE ADDRESS FROM ITS ASSOCIATED WRITE DATA IN 
A STORE TO A SHARED MEMORY IN A MULTIPROCESSOR SYSTEM 



Please amend the paragraph beginning on page 1 at line 5, as follows: 

This application is related to U.S. Patent Application No. [[ ]] 10/643,744, entitled 
"Multistream Processing System and Method", filed on even date herewith; to U.S. Patent 
Application No. [[ ]] 10/643,577, entitled "System and Method for Synchronizing Processing 
Memory Transfers", Serial No. [[ ,]] filed on even date herewith; to U.S. Patent 
Application No. [[ ]] 10/643,586, entitled "Decoupled Vector Scalar/Vector Computer 
Architecture System and Method (as amended)", filed on even date herewith; to U.S. Patent 
Application No. [[ ]] 10/643,585, entitled "Latency Tolerant Distributed Shared Memory 
Multiprocessor Computer", filed on even date herewith; to U.S. Patent Application No. 
[[ ]] 10/643,754, entitled "Relaxed Memory Consistency Model", filed on even date herewith; 
to U.S. Patent Application No. [[ ]] 10/643,758, entitled "Remote Translation Mechanism for 
a Multinode System", filed on even date herewith; and to U.S. Patent Application No. 
[[ ]] 10/643,741, entitled "[[Method and Apparatus for Local Synchronizations in a Vector 
Processor System]] Multistream Processing Memory-And Barrier-Synchronization Method and 
Apparatus", filed on even date herewith, each of which is incorporated herein by reference. 
Please amend paragraph [0012] (beginning on page 3 at line 5) as follows: 

Not all processors [[16]] 12 have to be the same. A multiprocessor computer system 10 
having different types of processors connected to a shared memory 16 is shown in Fig. 1b. 
Multiprocessor computer system 10 includes a scalar processing unit 12, a vector processing unit 
14 and a shared memory 16. Shared memory 16 includes a store address buffer 19. 



Please amend paragraph [0043] (beginning on page 9 at line 14) as follows: 

In one embodiment, global memory 26 as shown in Fig. 3 is distributed to each MSP 30 
as local memory [[48]] (not shown in Figures). In one embodiment, local memory is packaged as 
a separate chip (termed the "M" chip as shown in Fig. 4, block 26). Each Ecache 24 has four 
ports 34 to M chip [[42]] 26 and connected through M chip 26 to local memory [[(and through M 
chip 42 to local memory 48 and to network 38)]]. In one embodiment, ports 34 are 16 data bits in 
each direction. MSP 30 has a total of 25.6 GB/s load bandwidth and 12.8-20.5 GB/s store 
bandwidth (depending upon stride) to local memory. 